Active deep learning on entity resolution by risk sampling
Authors
Abstract
While the state-of-the-art performance on entity resolution (ER) has been achieved by deep learning, its effectiveness depends on large quantities of accurately labeled training data. To alleviate the data labeling burden, Active Learning (AL) presents itself as a feasible solution that focuses on the data deemed useful for model training. Building upon recent advances in risk analysis for ER, which can provide a more refined estimate of label misprediction risk than simpler classifier outputs, we propose a novel AL approach of risk sampling for ER. Risk sampling leverages misprediction risk estimation for active instance selection. Based on the core-set characterization of AL, we theoretically derive an optimization model that aims to minimize the core-set loss with non-uniform Lipschitz continuity. Since the defined weighted K-medoids problem is NP-hard, we then present an efficient heuristic algorithm. Finally, we empirically verify the efficacy of the proposed approach on real data by a comparative study. Our extensive experiments have shown that it outperforms the existing alternatives by considerable margins.
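The abstract names a weighted K-medoids problem but does not describe the heuristic used to solve it. As a rough illustration only, the sketch below shows a generic greedy heuristic for weighted K-medoids. It is not the authors' algorithm. The function name `greedy_weighted_k_medoids`, the use of per-instance risk scores as weights, and the Euclidean distance are all assumptions made for this example.

```python
import numpy as np


def greedy_weighted_k_medoids(X, weights, k):
    """Greedily pick k medoid indices to reduce sum_i weights[i] * min_j d(x_i, m_j).

    Hypothetical helper for illustration; weights stand in for assumed
    per-instance risk scores, and d is the Euclidean distance.
    """
    X = np.asarray(X, dtype=float)
    weights = np.asarray(weights, dtype=float)
    n = X.shape[0]
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of points")

    # Pairwise Euclidean distances, shape (n, n).
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)

    nearest = np.full(n, np.inf)      # distance to the closest chosen medoid
    is_chosen = np.zeros(n, dtype=bool)
    chosen = []

    for _ in range(k):
        # Column c holds each point's nearest-medoid distance if c were added.
        candidate = np.minimum(nearest[:, None], dist)
        cost = weights @ candidate    # weighted cost per candidate medoid
        cost[is_chosen] = np.inf      # never pick the same point twice
        best = int(np.argmin(cost))
        chosen.append(best)
        is_chosen[best] = True
        nearest = candidate[:, best]

    return chosen, float(weights @ nearest)


if __name__ == "__main__":
    points = [[0.0], [1.0], [10.0], [11.0]]
    risk = [5.0, 1.0, 1.0, 1.0]       # assumed risk scores, not real data
    medoids, cost = greedy_weighted_k_medoids(points, risk, k=2)
    print(medoids, cost)
```

In this sketch, a point with a larger weight pulls a medoid toward itself, which mirrors the idea of spending the labeling budget on instances with high misprediction risk. A greedy pass gives no optimality guarantee for an NP-hard problem; it only serves to make the objective concrete.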
Similar resources
Deep Active Learning for Named Entity Recognition
Deep neural networks have advanced the state of the art in named entity recognition. However, under typical training procedures, advantages over classical methods emerge only with large datasets. As a result, deep learning is employed only when large public datasets or a large budget for manually labeling data is available. In this work, we show that by combining deep learning with active learn...
DeepER - Deep Entity Resolution
Entity resolution (ER) is a key data integration problem. Despite the efforts in 70+ years in all aspects of ER, there is still a high demand for democratizing ER – humans are heavily involved in labeling data, performing feature engineering, tuning parameters, and defining blocking functions. With the recent advances in deep learning, in particular distributed representation of words (a.k.a. w...
Named Entity Recognition in Persian Text using Deep Learning
Named entities recognition is a fundamental task in the field of natural language processing. It is also known as a subset of information extraction. The process of recognizing named entities aims at finding proper nouns in the text and classifying them into predetermined classes such as names of people, organizations, and places. In this paper, we propose a named entity recognizer which benefi...
Coreference resolution with deep learning in the Persian Language
Coreference resolution is an advanced issue in natural language processing. Nowadays, due to the extension of social networks, TV channels, news agencies, the Internet, etc. in human life, reading all the contents, analyzing them, and finding a relation between them require time and cost. In the present era, text analysis is performed using various natural language processing techniques, one ...
Deep Learning Quadcopter Control via Risk-Aware Active Learning
Modern optimization-based approaches to control increasingly allow automatic generation of complex behavior from only a model and an objective. Recent years have seen growing interest in fast solvers to also allow real-time operation on robots, but the computational cost of such trajectory optimization remains prohibitive for many applications. In this paper we examine a novel deep neural networ...
Journal
Journal title: Knowledge-Based Systems
Year: 2022
ISSN: 1872-7409, 0950-7051
DOI: https://doi.org/10.1016/j.knosys.2021.107729